distance estimation
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Spain > Canary Islands (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (17 more...)
Environment-Aware Indoor LoRaWAN Ranging Using Path Loss Model Inversion and Adaptive RSSI Filtering
Obiri, Nahshon Mokua, Van Laerhoven, Kristof
Achieving sub-10 m indoor ranging with LoRaWAN is difficult because multipath, human blockage, and micro-climate dynamics induce non-stationary attenuation in received signal strength indicator (RSSI) measurements. We present a lightweight, interpretable pipeline that couples an environment-aware multi-wall path loss model with a forward-only, innovation-driven Kalman prefilter for RSSI. The model augments distance and wall terms with frequency, signal-to-noise ratio (SNR), and co-located environmental covariates (temperature, relative humidity, carbon dioxide, particulate matter, and barometric pressure), and is inverted deterministically for distance estimation. On a one-year single-gateway office dataset comprising over 2 million uplinks, the approach attains a mean absolute error (MAE) of 4.74 m and a root mean square error (RMSE) of 6.76 m in distance estimation, improving over a COST-231 multi-wall baseline (12.07 m MAE) and its environment-augmented variant (7.76 m MAE. Filtering reduces RSSI volatility from 10.33 to 5.43 dB and halves path loss error to 5.35 dB while raising R-squared from 0.82 to 0.89. The result is a single-anchor LoRaWAN ranging method with constant per-packet cost that is accurate, robust, and interpretable, providing a strong building block for multi-gateway localization.
Performance Evaluation of an Integrated System for Visible Light Communication and Positioning Using an Event Camera
Soga, Ryota, Kobayashi, Masataka, Shimizu, Tsukasa, Shiba, Shintaro, Kong, Quan, Lu, Shan, Yamazato, Takaya
Event cameras, featuring high temporal resolution and high dynamic range, offer visual sensing capabilities comparable to conventional image sensors while capturing fast-moving objects and handling scenes with extreme lighting contrasts such as tunnel exits. Leveraging these properties, this study proposes a novel self-localization system that integrates visible light communication (VLC) and visible light positioning (VLP) within a single event camera. The system enables a vehicle to estimate its position even in GPS-denied environments, such as tunnels, by using VLC to obtain coordinate information from LED transmitters and VLP to estimate the distance to each transmitter. Multiple LEDs are installed on the transmitter side, each assigned a unique pilot sequence based on Walsh-Hadamard codes. The event camera identifies individual LEDs within its field of view by correlating the received signal with these codes, allowing clear separation and recognition of each light source. This mechanism enables simultaneous high-capacity MISO (multi-input single-output) communication through VLC and precise distance estimation via phase-only correlation (POC) between multiple LED pairs. To the best of our knowledge, this is the first vehicle-mounted system to achieve simultaneous VLC and VLP functionalities using a single event camera. Field experiments were conducted by mounting the system on a vehicle traveling at 30 km/h (8.3 m/s). The results demonstrated robust real-world performance, with a root mean square error (RMSE) of distance estimation within 0.75 m for ranges up to 100 m and a bit error rate (BER) below 0.01 across the same range.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > Switzerland (0.04)
- Asia > Japan > Honshū > Chūbu > Aichi Prefecture > Nagoya (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- (18 more...)
NeoARCADE: Robust Calibration for Distance Estimation to Support Assistive Drones for the Visually Impaired
Raj, Suman, Madhabhavi, Bhavani A, Kumar, Madhav, Gupta, Prabhav, Simmhan, Yogesh
Autonomous navigation by drones using onboard sensors, combined with deep learning and computer vision algorithms, is impacting a number of domains. We examine the use of drones to autonomously follow and assist Visually Impaired People (VIPs) in navigating urban environments. Estimating the absolute distance between the drone and the VIP, and to nearby objects, is essential to design obstacle avoidance algorithms. Here, we present NeoARCADE (Neo), which uses depth maps over monocular video feeds, common in consumer drones, to estimate absolute distances to the VIP and obstacles. Neo proposes robust calibration technique based on depth score normalization and coefficient estimations to translate relative distances from depth map to absolute ones. It further develops a dynamic recalibration method that can adapt to changing scenarios. We also develop two baseline models, Regression and Geometric, and compare Neo with SOTA depth map approaches and the baselines. We provide detailed evaluations to validate their robustness and generalizability for distance estimation to VIPs and other obstacles in diverse and dynamic conditions, using datasets collected in a campus environment. Neo predicts distances to VIP with an error <30cm, and to different obstacles like cars and bicycles within a maximum error of 60cm, which are better than the baselines. Neo also clearly out-performs SOTA depth map methods, reporting errors up to 5.3-14.6x lower.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Asia > India > Karnataka > Bengaluru (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (10 more...)
- Transportation (1.00)
- Information Technology > Robotics & Automation (0.88)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.46)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Integrating Spatial and Semantic Embeddings for Stereo Sound Event Localization in Videos
Berghi, Davide, Jackson, Philip J. B.
In this study, we address the multimodal task of stereo sound event localization and detection with source distance estimation (3D SELD) in regular video content. 3D SELD is a complex task that combines temporal event classification with spatial localization, requiring reasoning across spatial, temporal, and semantic dimensions. The last is arguably the most challenging to model. Traditional SELD approaches typically rely on multichannel input, limiting their capacity to benefit from large-scale pre-training due to data constraints. To overcome this, we enhance a standard SELD architecture with semantic information by integrating pre-trained, contrastive language-aligned models: CLAP for audio and OWL-ViT for visual inputs. These embeddings are incorporated into a modified Conformer module tailored for multimodal fusion, which we refer to as the Cross-Modal Conformer. We perform an ablation study on the development set of the DCASE2025 Task3 Stereo SELD Dataset to assess the individual contributions of the language-aligned models and benchmark against the DCASE Task 3 baseline systems. Additionally, we detail the curation process of large synthetic audio and audio-visual datasets used for model pre-training. These datasets were further expanded through left-right channel swapping augmentation. Our approach, combining extensive pre-training, model ensembling, and visual post-processing, achieved second rank in the DCASE 2025 Challenge Task 3 (Track B), underscoring the effectiveness of our method. Future work will explore the modality-specific contributions and architectural refinements.
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
- Europe > United Kingdom > England > Surrey > Guildford (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.67)
Warehouse Spatial Question Answering with LLM Agent
Huang, Hsiang-Wei, Cheng, Jen-Hao, Chen, Kuang-Ming, Yang, Cheng-Yen, Alattar, Bahaa, Lin, Yi-Ru, Kim, Pyongkun, Kim, Sangwon, Kim, Kwangju, Huang, Chung-I, Hwang, Jenq-Neng
Spatial understanding has been a challenging task for existing Multi-modal Large Language Models~(MLLMs). Previous methods leverage large-scale MLLM finetuning to enhance MLLM's spatial understanding ability. In this paper, we present a data-efficient approach. We propose a LLM agent system with strong and advanced spatial reasoning ability, which can be used to solve the challenging spatial question answering task in complex indoor warehouse scenarios. Our system integrates multiple tools that allow the LLM agent to conduct spatial reasoning and API tools interaction to answer the given complicated spatial question. Extensive evaluations on the 2025 AI City Challenge Physical AI Spatial Intelligence Warehouse dataset demonstrate that our system achieves high accuracy and efficiency in tasks such as object retrieval, counting, and distance estimation. The code is available at: https://github.com/hsiangwei0903/SpatialAgent
- Asia > Taiwan (0.05)
- North America > United States (0.04)
- Asia > South Korea > Daegu > Daegu (0.04)
ShrinkBox: Backdoor Attack on Object Detection to Disrupt Collision Avoidance in Machine Learning-based Advanced Driver Assistance Systems
Shahzad, Muhammad Zaeem, Hanif, Muhammad Abdullah, Ouni, Bassem, Shafique, Muhammad
Advanced Driver Assistance Systems (ADAS) significantly enhance road safety by detecting potential collisions and alerting drivers. However, their reliance on expensive sensor technologies such as LiDAR and radar limits accessibility, particularly in low- and middle-income countries. Machine learning-based ADAS (ML-ADAS), leveraging deep neural networks (DNNs) with only standard camera input, offers a cost-effective alternative. Critical to ML-ADAS is the collision avoidance feature, which requires the ability to detect objects and estimate their distances accurately. This is achieved with specialized DNNs like YOLO, which provides real-time object detection, and a lightweight, detection-wise distance estimation approach that relies on key features extracted from the detections like bounding box dimensions and size. However, the robustness of these systems is undermined by security vulnerabilities in object detectors. In this paper, we introduce ShrinkBox, a novel backdoor attack targeting object detection in collision avoidance ML-ADAS. Unlike existing attacks that manipulate object class labels or presence, ShrinkBox subtly shrinks ground truth bounding boxes. This attack remains undetected in dataset inspections and standard benchmarks while severely disrupting downstream distance estimation. We demonstrate that ShrinkBox can be realized in the YOLOv9m object detector at an Attack Success Rate (ASR) of 96%, with only a 4% poisoning ratio in the training instances of the KITTI dataset. Furthermore, given the low error targets introduced in our relaxed poisoning strategy, we find that ShrinkBox increases the Mean Absolute Error (MAE) in downstream distance estimation by more than 3x on poisoned samples, potentially resulting in delays or prevention of collision warnings altogether.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States (0.04)
- Europe > Italy > Lazio > Rome (0.04)
- Asia > Nepal (0.04)
- Information Technology > Security & Privacy (0.92)
- Automobiles & Trucks > Manufacturer (0.61)
- Transportation > Ground > Road (0.34)
Real-Time Fusion of Visual and Chart Data for Enhanced Maritime Vision
Kreis, Marten, Kiefer, Benjamin
This paper presents a novel approach to enhancing marine vision by fusing real-time visual data with chart information. Our system overlays nautical chart data onto live video feeds by accurately matching detected navigational aids, such as buoys, with their corresponding representations in chart data. T o achieve robust association, we introduce a transformer-based end-to-end neural network that predicts bounding boxes and confidence scores for buoy queries, enabling the direct matching of image-domain detections with world-space chart markers. The proposed method is compared against baseline approaches, including a ray-casting model that estimates buoy positions via camera projection and a YOLOv7-based network extended with a distance estimation module. Experimental results on a dataset of real-world maritime scenes demonstrate that our approach significantly improves object localization and association accuracy in dynamic and challenging environments.
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.76)
- North America > United States (0.46)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Government (0.46)
- Transportation (0.46)